Introduction

This vignette describes basic usage of the package intended to process several large bedgraph files in R. Methrix provides set of function which allows easy importing of various flavors of bedgraphs generated by methylation callers, and many downstream analysis to be performed on large matrices.

Processing

CpG annotation

As a first step, we need a list of CpG sites in the respective genome. The CpG sites are listed using the respective Bsgenome annotation package. The read_bedgraph function is also able to extract CpG sites on it’s own, however, it might be beneficial to do it separately.

## -Extracting CpGs
## -Done. Extracted 28,700,086 CpGs from 93 contigs.

Reading bedGraph files

read_bedgraphs function is a versatile bedgraph reader intended to import bedgraph files generated virtually by any sort of methylation calling program. It requires user to provide indices for chromosome names, start position and other required fields. There are also presets available to import bedgraphs from most common programs such as Bismark (.cov format), MethylDackel, and MethylcTools. In this case, there is no need to define e.g. chr_idx start_idx arguments, the function will automatically assign them. For the sake of fast analysis, here we will use example dataset from methrix.

## [1] "C1.bedGraph.gz" "C2.bedGraph.gz" "N1.bedGraph.gz" "N2.bedGraph.gz"

The read_bedgraphs function adds CpGs missing from the reference set, and creates a methylation/coverage matrices. Once the process is complete - it returns an object of class methrix which in turn inherits SummarizedExperiment class. methrix object contains ‘methylation’ and ‘coverage’ matrices (either in-memory or as on-disk HDF5 arrays) along with pheno-data and other basic info. This object can be passed to all downstream functions for various analysis. For further details on the data structure, see the SummarizedExperiment package.

  • For computers with large memory, increase speed using vect=TRUE. The function will open multiple files, defined by the vect_batch_size argument. Keep in mind, that increasing the batch size will increase the memory need as well.

  • If vect=FALSE, the function will iterate along the list of files, keeping only one open at a time.

## ----------------------------
## -Preset:        Custom
## --Missing beta and coverage info. Estimating them from M and U values
## -CpGs raw:      28,700,086
## -CpGs filtered: 28,217,448
## ----------------------------
## -Batch:         1/1
## -Processing:    C1.bedGraph.gz
## --CpGs missing: 28,216,771
## -Processing:    C2.bedGraph.gz
## --CpGs missing: 28,216,759
## -Processing:    N1.bedGraph.gz
## --CpGs missing: 28,216,746
## -Processing:    N2.bedGraph.gz
## --CpGs missing: 28,216,747
## -Finished in:  52.4s elapsed (00:01:04 cpu)

Potential pitfalls: Too many missing CpGs, if not expected, might indicate that something went wrong.

  • An obvious error is using the wrong reference genome.

  • A common mistake is falsely indicating strandedness and zero or one based positioning. The combination of the two might lead to only using C-s from one strand (due to a 1 bp shift), resulting lower coverage for the sites. A nice description of the zero- and one-based coordination system can be found here

Initial QC and summary

Without any further processing, we can create an interactive html report containing basic summary statistics of the methrix object with methrix_report function.

In the report, we can check genome-wide and chromosome based statistics on coverage and methylation, in order to identify potential quality issues, for example:

  • samples with too high or too low genome-wide coverage or on a specific chromosome. In this case, there is also a possibility, that the sample is affected by copy number alterations. Both methylation levels and coverage therefore shows higher variation among cancer samples.

  • samples with altered beta level distribution.

  • A “bump” on the beta level density plot at 0.5 indicates the presence of single nucleotide polymorphisms (SNPs) -> see later.

Filtering

Coverage based filtering

To ensure the high quality of our dataset, the sites with very low (untrustworthy methylation level) or high coverage (technical problem) should not be used. We can mask these CpG sites. Please note, that the DSS we were using for Differential Methylation Region (DMR) calling, doesn’t need the removal of the lowly covered sites, because it takes it into account by analysis. However, using a not too restrictive threshold won’t interfere with DMR calling.

## -Masked 318 CpGs due to low coverage.
## -Masked 1 CpGs due to too high coverage in sample C1.
## -Masked 5 CpGs due to too high coverage in sample C2.
## -Masked 5 CpGs due to too high coverage in sample N1.
## -Masked 5 CpGs due to too high coverage in sample N2.
## -Finished in:  6.690s elapsed (4.550s cpu)

Later we can remove those sites that are not covered by any of the samples.

## -Removed 28,216,735 [100%] uncovered loci of 28,217,448 sites
## -Finished in:  14.0s elapsed (18.6s cpu)

We might want to remove the sites that are sparsely covered. If we plan to do differential analysis, we can do it group-wise, or using the whole population.

## -Retained 430 of 713 sites
## -Finished in:  0.920s elapsed (0.610s cpu)

SNP filtering

SNPs, mostly the C > T mutations can disrupt methylation calling. Therefore it is essential to remove CpG sites that overlap with common variants, if we have variation in our study population. For example working with human, unpaired data, e.g. treated vs. untreated groups. During filtering, we can select the population of interest and the minimum minor allele frequency (MAF) of the SNPs. SNP filtering is currently implemented for hg19 and hg38 assemblies.

## Used SNP database: MafDb.1Kgenomes.phase3.hs37d5.
## Number of SNPs removed:
##      chr  N
## 1: chr21 23
## 2: chr22 30
## Sum:
## [1] 53
  • There is a possibility to keep the sites possibly overlapping with SNPs. We can use this data for detecting possible sample mismatches. For this, we need to plot the most variable sites (probably the ones that are variable in our study population) and plot them on a heatmap.

  • SNP removal in mouse data: Usually, there is no need to remove SNPs from mouse data, since for many experiments, the same strain of mice is used. However, there are cases, when different crosses of strains are used, for which, the level of heterozygosity might be different. SNP data for mice can be downloaded from here: https://www.sanger.ac.uk/science/data/mouse-genomes-project. The SNPs can be filtered out using the region filter function.

Visualization and QC after filtering

Methrix report

After filtering, it worth running a report again, to see if any of the samples were so lowly covered that it warrants an action (e.g. removal of the sample)

Important: Don’t forget to set the recal_stats to TRUE, since the object changed since reading in. The output directory has to be different from last time, to avoid using the intermediate files calculated during the last run.

There are additional possibilities to visualize the study. We can look at the density of coverage both sample- and group-wise:

We can visualize te beta value distribution as a violin plot or density plot. These plots (as well as the coverage plot) use the most variable 25000 CpG sites to ensure fast processing. plot_density and plot_violin has the option to use data only from a restricted region.

Dimensionality reduction

Methrix offers principal component analysis (PCA) to conduct and visualize dimensionality reduction. As a first step, the PCA model has to be calculated. As for plotting, a given number of sites are selected, either random or based on variance (var option). This ensures that the calculations remain feasible.

At visualization, we can provide the methrix object to allow color or shape annotation of groups or samples.

methrix offers the possibility of region based filtering. With this option, selected regions, e.g. promoters can be visualized. We will use the annotatr package to assess basic annotation categories.

## GRanges object with 82960 ranges and 5 metadata columns:
##                 seqnames        ranges strand |             id       tx_id
##                    <Rle>     <IRanges>  <Rle> |    <character> <character>
##       [1]           chr1   10874-11873      + |     promoter:1  uc001aaa.3
##       [2]           chr1   10874-11873      + |     promoter:2  uc010nxq.1
##       [3]           chr1   10874-11873      + |     promoter:3  uc010nxr.1
##       [4]           chr1   68091-69090      + |     promoter:4  uc001aal.1
##       [5]           chr1 320084-321083      + |     promoter:5  uc001aaq.2
##       ...            ...           ...    ... .            ...         ...
##   [82956] chrUn_gl000237     2687-3686      - | promoter:82956  uc011mgu.1
##   [82957] chrUn_gl000241   36876-37875      - | promoter:82957  uc011mgv.2
##   [82958] chrUn_gl000243   10501-11500      + | promoter:82958  uc011mgw.1
##   [82959] chrUn_gl000243   12608-13607      + | promoter:82959  uc022brq.1
##   [82960] chrUn_gl000247     5817-6816      - | promoter:82960  uc022brr.1
##               gene_id      symbol                 type
##           <character> <character>          <character>
##       [1]   100287102     DDX11L1 hg19_genes_promoters
##       [2]   100287102     DDX11L1 hg19_genes_promoters
##       [3]   100287102     DDX11L1 hg19_genes_promoters
##       [4]       79501       OR4F5 hg19_genes_promoters
##       [5]        <NA>        <NA> hg19_genes_promoters
##       ...         ...         ...                  ...
##   [82956]        <NA>        <NA> hg19_genes_promoters
##   [82957]        <NA>        <NA> hg19_genes_promoters
##   [82958]        <NA>        <NA> hg19_genes_promoters
##   [82959]        <NA>        <NA> hg19_genes_promoters
##   [82960]        <NA>        <NA> hg19_genes_promoters
##   -------
##   seqinfo: 93 sequences (1 circular) from hg19 genome

Differential methylation calling

DSS

A very important step of a WGBS data analysis is the differential methylation calling. During this step, we would like to identify changes between groups or conditions first on the site level. Then, we will need to identify larger regions (DMR-s, differentially methylated regions) affected by methylation changes that are more likely to represent functional alterations. Methrix doesn’t have a DMR caller, therefore we will use DSS. At this stage, DSS is not yet able to work directly with methrix, therefore we transform our methrix object to bsseq for the sake of DMR calling.

For more detailed description, options and recommendations on smoothing, please refer to the ´DSS´ vignette

Call DMRs and DMLs

DMRs:

chr start end length nCG meanMethy1 meanMethy2 diff.Methy areaStat
14 chr22 49004740 49006235 1496 31 0.5781963 0.8942923 -0.3160959 -107.20207
5 chr21 47518792 47519087 296 16 0.6997633 0.1373555 0.5624078 98.88005
8 chr21 47537796 47538662 867 16 0.4827543 0.8208406 -0.3380862 -45.88851
7 chr21 47536948 47537698 751 16 0.4975329 0.7797307 -0.2821978 -42.37607
11 chr22 45085101 45085453 353 10 0.4298283 0.8259031 -0.3960749 -40.41343
4 chr21 47286319 47287013 695 12 0.5860615 0.8665249 -0.2804633 -34.55679
1 chr21 27867791 27868434 644 9 0.4111249 0.7517059 -0.3405811 -29.67288
12 chr22 45086168 45086541 374 7 0.5305423 0.7868741 -0.2563318 -18.98492
13 chr22 49004424 49004642 219 7 0.6718556 0.8557647 -0.1839091 -14.43668
6 chr21 47520470 47520710 241 5 0.3155754 0.6228646 -0.3072891 -13.86936
2 chr21 27869149 27869648 500 4 0.4185388 0.8134162 -0.3948774 -12.97328
17 chr22 49007313 49007398 86 5 0.7567327 0.9309532 -0.1742205 -10.84714
10 chr21 47540244 47540455 212 5 0.5135043 0.7357068 -0.2222025 -10.61340

DMLs:

chr pos mu1 mu2 diff diff.se stat phi1 phi2 pval fdr postprob.overThreshold
168 chr21 47518998 0.6992837 0.1374641 0.5618196 0.0853889 6.579540 0.0610930 0.0446073 0.0000000 0.0000000 1.0000000
155 chr21 47518792 0.7054168 0.1202545 0.5851623 0.0913289 6.407197 0.0514057 0.0591479 0.0000000 0.0000000 0.9999999
156 chr21 47518800 0.7054168 0.1202545 0.5851623 0.0919110 6.366621 0.0431757 0.0628077 0.0000000 0.0000000 0.9999999
157 chr21 47518812 0.6916001 0.1352522 0.5563479 0.0890065 6.250641 0.0479340 0.0607602 0.0000000 0.0000000 0.9999999
163 chr21 47518896 0.6992837 0.1374641 0.5618196 0.0901120 6.234678 0.0486725 0.0592783 0.0000000 0.0000000 0.9999999
161 chr21 47518867 0.6992837 0.1374641 0.5618196 0.0901704 6.230640 0.0437346 0.0430716 0.0000000 0.0000000 0.9999998
167 chr21 47518935 0.6992837 0.1374641 0.5618196 0.0902003 6.228576 0.1511329 0.0570187 0.0000000 0.0000000 0.9999998
164 chr21 47518917 0.6992837 0.1374641 0.5618196 0.0906481 6.197809 0.0426914 0.0494803 0.0000000 0.0000000 0.9999998
162 chr21 47518877 0.6992837 0.1374641 0.5618196 0.0908797 6.182011 0.0428864 0.0440936 0.0000000 0.0000000 0.9999998
160 chr21 47518851 0.6992837 0.1374641 0.5618196 0.0911002 6.167053 0.0626699 0.0640963 0.0000000 0.0000000 0.9999998
170 chr21 47519087 0.7024413 0.1588468 0.5435945 0.0882510 6.159642 0.0610382 0.0503966 0.0000000 0.0000000 0.9999998
159 chr21 47518846 0.6992837 0.1374641 0.5618196 0.0924657 6.075977 0.0816615 0.0626637 0.0000000 0.0000000 0.9999997
166 chr21 47518926 0.6992837 0.1374641 0.5618196 0.0931339 6.032382 0.0426382 0.0509624 0.0000000 0.0000000 0.9999996
165 chr21 47518924 0.6992837 0.1374641 0.5618196 0.0939471 5.980171 0.0405490 0.0526171 0.0000000 0.0000001 0.9999996
158 chr21 47518842 0.6992837 0.1374641 0.5618196 0.0942631 5.960121 0.0471687 0.0643234 0.0000000 0.0000001 0.9999995
169 chr21 47519054 0.6992169 0.1509747 0.5482422 0.0940867 5.826988 0.0449055 0.0479805 0.0000000 0.0000001 0.9999991
335 chr22 49005349 0.4514889 0.9044401 -0.4529512 0.0851709 -5.318149 0.1087880 0.0362448 0.0000001 0.0000023 0.9999829
337 chr22 49005393 0.4382166 0.8942026 -0.4559860 0.0888252 -5.133523 0.0530798 0.0513445 0.0000003 0.0000059 0.9999693
338 chr22 49005405 0.4303463 0.8911297 -0.4607835 0.0901308 -5.112387 0.0510360 0.0441204 0.0000003 0.0000062 0.9999687
336 chr22 49005371 0.4514889 0.9044401 -0.4529512 0.0896909 -5.050135 0.0607954 0.0643722 0.0000004 0.0000082 0.9999584
340 chr22 49005581 0.4563350 0.8954013 -0.4390663 0.0889273 -4.937363 0.0498288 0.0539334 0.0000008 0.0000141 0.9999313
339 chr22 49005551 0.4563350 0.8954013 -0.4390663 0.0896094 -4.899778 0.0447945 0.0412302 0.0000010 0.0000163 0.9999228
275 chr22 45085101 0.3625507 0.8274930 -0.4649423 0.0981511 -4.737004 0.0449663 0.0552770 0.0000022 0.0000352 0.9998997
19 chr21 27869149 0.0448313 0.6676962 -0.6228649 0.1408678 -4.421627 0.0529055 0.0379350 0.0000098 0.0001142 0.9998972
334 chr22 49005299 0.4493985 0.8826734 -0.4332749 0.0917291 -4.723417 0.0414075 0.0545687 0.0000023 0.0000360 0.9998601
276 chr22 45085113 0.3581053 0.8077136 -0.4496084 0.0964650 -4.660842 0.0667119 0.0914574 0.0000031 0.0000452 0.9998551
15 chr21 27868252 0.3731671 0.7994402 -0.4262731 0.0908086 -4.694194 0.0484659 0.0609671 0.0000027 0.0000399 0.9998365
279 chr22 45085158 0.3581053 0.8077136 -0.4496084 0.0985216 -4.563553 0.0471944 0.0513928 0.0000050 0.0000695 0.9998063
278 chr22 45085153 0.3581053 0.8077136 -0.4496084 0.1006989 -4.464879 0.0424381 0.1077930 0.0000080 0.0001017 0.9997416
277 chr22 45085130 0.3581053 0.8077136 -0.4496084 0.1011552 -4.444737 0.1017411 0.0866417 0.0000088 0.0001059 0.9997261
333 chr22 49005192 0.4634613 0.8820300 -0.4185687 0.0938413 -4.460388 0.0540585 0.0785553 0.0000082 0.0001017 0.9996566
280 chr22 45085308 0.4505152 0.8320146 -0.3814995 0.0853803 -4.468239 0.0566719 0.0613755 0.0000079 0.0001017 0.9995114
344 chr22 49005951 0.4923568 0.8833153 -0.3909585 0.0894254 -4.371894 0.0434765 0.0414754 0.0000123 0.0001392 0.9994303
282 chr22 45085355 0.4557529 0.8537278 -0.3979749 0.0976251 -4.076564 0.0589676 0.0787979 0.0000457 0.0005014 0.9988645
332 chr22 49005145 0.5006768 0.8829753 -0.3822986 0.0946697 -4.038236 0.1013749 0.0462659 0.0000539 0.0005739 0.9985680
14 chr21 27868217 0.3824057 0.7921241 -0.4097185 0.1051999 -3.894665 0.1422973 0.0704692 0.0000983 0.0009913 0.9983811
281 chr22 45085330 0.4505152 0.8320146 -0.3814995 0.0970732 -3.930019 0.0406919 0.0453350 0.0000849 0.0008801 0.9981338
13 chr21 27868022 0.3188341 0.7096548 -0.3908207 0.1064360 -3.671886 0.0393516 0.2277589 0.0002408 0.0022452 0.9968578
331 chr22 49005139 0.5171680 0.8806450 -0.3634770 0.0974334 -3.730517 0.0710156 0.0676265 0.0001911 0.0018276 0.9965773
11 chr21 27867985 0.3030118 0.6966354 -0.3936236 0.1101373 -3.573936 0.0469781 0.0423628 0.0003517 0.0031853 0.9961655
215 chr21 47537470 0.5037846 0.8382309 -0.3344463 0.0889507 -3.759906 0.0528036 0.0588836 0.0001700 0.0016685 0.9958022
231 chr21 47538444 0.4052968 0.8266158 -0.4213191 0.1221300 -3.449759 0.0997590 0.0563437 0.0005611 0.0043601 0.9957526
229 chr21 47538409 0.3963148 0.8139883 -0.4176736 0.1208318 -3.456653 0.1688207 0.0444264 0.0005469 0.0043405 0.9957281
230 chr21 47538440 0.4052968 0.8266158 -0.4213191 0.1228531 -3.429453 0.0535200 0.0440788 0.0006048 0.0046039 0.9955558
24 chr21 47284793 0.4628484 0.8460420 -0.3831936 0.1082997 -3.538270 0.0655654 0.0541894 0.0004028 0.0034143 0.9955415
25 chr21 47284838 0.4628484 0.8460420 -0.3831936 0.1089589 -3.516863 0.0610756 0.0536466 0.0004367 0.0036196 0.9953310
12 chr21 27868007 0.3188341 0.7096548 -0.3908207 0.1148248 -3.403625 0.0364220 0.0762724 0.0006650 0.0047700 0.9943507
23 chr21 47284703 0.4628484 0.8460420 -0.3831936 0.1121245 -3.417571 0.0700882 0.0627163 0.0006318 0.0047134 0.9942349
227 chr21 47538333 0.3809894 0.8009985 -0.4200091 0.1268170 -3.311932 0.0604774 0.0780869 0.0009265 0.0061714 0.9942092
57 chr21 47286743 0.4991447 0.8612133 -0.3620685 0.1063704 -3.403845 0.0631799 0.0558895 0.0006644 0.0047700 0.9931321
345 chr22 49005995 0.5987661 0.9083944 -0.3096283 0.0871067 -3.554586 0.0425805 0.0454939 0.0003786 0.0032839 0.9919497
233 chr21 47538549 0.4324684 0.8125794 -0.3801110 0.1181722 -3.216586 0.3041920 0.0955143 0.0012973 0.0080646 0.9911389
228 chr21 47538362 0.4161624 0.8146878 -0.3985254 0.1261927 -3.158070 0.1748563 0.0744497 0.0015882 0.0092561 0.9910393
9 chr21 27867791 0.4843264 0.7809081 -0.2965817 0.0831049 -3.568764 0.0592143 0.5521032 0.0003587 0.0031853 0.9909972
346 chr22 49006004 0.5987661 0.9083944 -0.3096283 0.0890551 -3.476816 0.0851953 0.0467683 0.0005074 0.0041144 0.9907137
58 chr21 47286758 0.4991447 0.8612133 -0.3620685 0.1116282 -3.243523 0.0838537 0.0521010 0.0011806 0.0077258 0.9905723
232 chr21 47538540 0.4324684 0.8125794 -0.3801110 0.1193858 -3.183887 0.2393464 0.0505929 0.0014531 0.0087421 0.9905476
199 chr21 47520710 0.2584592 0.5953496 -0.3368904 0.1009593 -3.336894 0.0455289 0.0717223 0.0008472 0.0057456 0.9905295
235 chr21 47538572 0.4324684 0.8125794 -0.3801110 0.1198739 -3.170923 0.3018380 0.0414823 0.0015196 0.0089967 0.9903039
330 chr22 49005081 0.5468489 0.8680866 -0.3212377 0.0953684 -3.368387 0.2110341 0.0579455 0.0007561 0.0053212 0.9898297
343 chr22 49005781 0.5229578 0.8432075 -0.3202497 0.0956307 -3.348816 0.0721138 0.0478057 0.0008116 0.0056059 0.9893697
234 chr21 47538562 0.4324684 0.8125794 -0.3801110 0.1219200 -3.117707 0.1623588 0.0558222 0.0018226 0.0101469 0.9892459
59 chr21 47286763 0.5460588 0.8826394 -0.3365807 0.1043409 -3.225778 0.0570088 0.0527843 0.0012563 0.0079424 0.9883308
214 chr21 47537365 0.5318998 0.8468988 -0.3149990 0.0983314 -3.203441 0.0440748 0.0741262 0.0013580 0.0083036 0.9856217
62 chr21 47286790 0.6103130 0.9093437 -0.2990307 0.0923822 -3.236888 0.1929197 0.3636197 0.0012084 0.0077713 0.9844045
236 chr21 47538610 0.4932530 0.8423031 -0.3490501 0.1170936 -2.980950 0.0904863 0.0534684 0.0028736 0.0144843 0.9833499
217 chr21 47537522 0.5333436 0.8461447 -0.3128012 0.1000745 -3.125683 0.0915707 0.0748481 0.0017739 0.0100254 0.9832847
329 chr22 49005011 0.5737763 0.8802274 -0.3064512 0.0977632 -3.134628 0.0621902 0.0378817 0.0017207 0.0098743 0.9826621
20 chr21 27869506 0.5431080 0.8619896 -0.3188816 0.1043744 -3.055170 0.1409278 0.0467053 0.0022493 0.0118267 0.9820367
60 chr21 47286771 0.5862985 0.8980315 -0.3117331 0.1010982 -3.083468 0.0392907 0.0534054 0.0020460 0.0110604 0.9819080
216 chr21 47537517 0.5333436 0.8461447 -0.3128012 0.1023925 -3.054922 0.0647327 0.0539652 0.0022512 0.0118267 0.9811863
198 chr21 47520632 0.2584592 0.5953496 -0.3368904 0.1142489 -2.948741 0.0688145 0.0501405 0.0031907 0.0154563 0.9810009
61 chr21 47286783 0.6103130 0.9093437 -0.2990307 0.0968681 -3.086988 0.0415506 0.0529952 0.0020220 0.0110604 0.9800629
22 chr21 27869648 0.5431080 0.8619896 -0.3188816 0.1105168 -2.885368 0.0810954 0.0385831 0.0039096 0.0184591 0.9762531
341 chr22 49005673 0.5557180 0.8783041 -0.3225861 0.1130008 -2.854724 0.1850322 0.0487802 0.0043074 0.0200834 0.9756600
212 chr21 47537252 0.4309434 0.7627531 -0.3318097 0.1187513 -2.794156 0.0875586 0.0456337 0.0052035 0.0235306 0.9746728
342 chr22 49005702 0.5387623 0.8470871 -0.3083247 0.1084777 -2.842287 0.1666430 0.0527486 0.0044791 0.0206261 0.9726823
63 chr21 47286854 0.6285885 0.8950792 -0.2664907 0.0883737 -3.015497 0.2101066 0.0636676 0.0025656 0.0132912 0.9702302
211 chr21 47537232 0.4309434 0.7627531 -0.3318097 0.1232631 -2.691883 0.0997421 0.0830046 0.0071050 0.0288061 0.9702174
10 chr21 27867918 0.4843264 0.7809081 -0.2965817 0.1062405 -2.791606 0.1074741 0.0532601 0.0052447 0.0235306 0.9679628
21 chr21 27869534 0.5431080 0.8619896 -0.3188816 0.1221247 -2.611114 0.1116294 0.0393736 0.0090248 0.0349723 0.9637577
287 chr22 45086364 0.4888850 0.7693777 -0.2804927 0.1016514 -2.759358 0.0528480 0.0458265 0.0057915 0.0253349 0.9621918
210 chr21 47537208 0.4435084 0.7425336 -0.2990252 0.1146361 -2.608474 0.7582071 0.0729645 0.0090947 0.0349723 0.9589812
64 chr21 47286962 0.6729210 0.9111849 -0.2382639 0.0800606 -2.976043 0.0780957 0.0443244 0.0029199 0.0145218 0.9579272
290 chr22 45086516 0.6076341 0.8390252 -0.2313910 0.0772711 -2.994535 0.0467562 0.0475440 0.0027486 0.0140444 0.9554802
195 chr21 47520470 0.3536529 0.6412079 -0.2875550 0.1111714 -2.586592 0.0723913 0.0412574 0.0096930 0.0368929 0.9544510
237 chr21 47538637 0.5470838 0.8449563 -0.2978725 0.1177522 -2.529654 0.1436890 0.0405465 0.0114175 0.0409493 0.9539255
291 chr22 45086539 0.6076341 0.8390252 -0.2313910 0.0783440 -2.953526 0.0515093 0.0495863 0.0031417 0.0154190 0.9532506
249 chr21 47539196 0.6531351 0.8994914 -0.2463563 0.0883547 -2.788265 0.1681109 0.0860229 0.0052991 0.0235306 0.9512303
288 chr22 45086376 0.4888850 0.7693777 -0.2804927 0.1095869 -2.559545 0.0431374 0.0476229 0.0104809 0.0387068 0.9504820
292 chr22 45086541 0.6076341 0.8390252 -0.2313910 0.0796762 -2.904141 0.0515093 0.0405770 0.0036826 0.0176105 0.9504485

Visualizing DMRs and candidate regions

Number and size of DMRs

Region annotation

To describe and interpret the differentially methylated regions, we can use the ChIPseeker package. The location of the DMRs can already give us a hint the involved processes, but once we assign the regions to genes, a pathway enrichment analysis is also possible. For additional details, see the vignette of ChIPseeker:

Working with HDF5-based object

read_bedgraph function

For memory efficient read in, one can use an HDF5 based methrix object. Only one bedgraph file is in the memory at the same time, while the resulting object won’t be stored in the memory, but on-disk. Additional arguments to use HDF5: - Set h5=TRUE - Set vect=FALSE - h5_dir –> a directory to save the final object. It is possible to save the object later. It increases the processing time significantly.
- h5temp –> a temporary directory to use during data processing. Set this if for example the default temporary location doesn’t have enough free space to store the temporary data.

## ----------------------------
## -Preset:        Custom
## --Missing beta and coverage info. Estimating them from M and U values
## -CpGs raw:      28,700,086
## -CpGs filtered: 28,217,448
## ----------------------------
## -Processing:    C1.bedGraph.gz
## --CpGs missing: 28,216,771
## -Processing:    C2.bedGraph.gz
## --CpGs missing: 28,216,759
## -Processing:    N1.bedGraph.gz
## --CpGs missing: 28,216,746
## -Processing:    N2.bedGraph.gz
## --CpGs missing: 28,216,747
## -Finished in:  00:01:10 elapsed (00:01:21 cpu)

Working with HDF5-based objects

All methrix functions work with HDF5-based objects as well, there is no difference in using different functions.

## -Removed 28,216,705 [100%] uncovered loci of 28,217,448 sites
## -Finished in:  39.5s elapsed (42.4s cpu)

It is also possible to transform non-HDF5-based objects to HDF5-based ones and back.

Saving and loading

Saving and loading of an HDF5-based object is not possible using the standard save or saveRDS functions. methrix offers easy to use saving and loading tools, which are essentially wrappers around the saveHDF5SummarizedExperiment and loadHDF5SummarizedExperiment functions.